WHAT IS CLAIMED IS:

1. A method for organizing cache memory for hardware acceleration of the FDTD method in a very high bandwidth, dual-port on-chip memory, comprising:
creating a plurality of small banks of internal memory; and
arranging the plurality of small banks of internal memory so that all data dependencies are capable of being statically wired.

2. A method for organizing cache memory for hardware acceleration of the FDTD method in a very high bandwidth, dual-port on-chip memory, comprising:
providing a first plurality of input memory banks that connect to corresponding one-cycle delay elements;
connecting the delay elements to corresponding computation engines;
providing a second plurality of input memory banks that connect to corresponding computation engines; and
connecting the computation engines to corresponding output memory banks.

3. A method for organizing cache memory as recited in claim 2, wherein the first plurality of input memory banks handles fields having i and j directional dependencies.

4. A method for organizing cache memory as recited in claim 3, wherein each of the first plurality of input memory banks includes a first channel that handles fields having i directional dependencies, and a second channel that handles fields having j directional dependencies.

5. A method for organizing cache memory as recited in claim 2, wherein the second plurality of input memory banks handles fields having k directional dependencies.

6. A method for organizing cache memory as recited in claim 2, wherein the output memory banks buffer updated fields before storing the updated fields to a bulk memory.

7. A method for organizing cache memory as recited in claim 2, wherein at least six output memory banks are provided.
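
For illustration only, the wiring recited in claim 2 can be modeled in software as a minimal sketch. The names `DelayElement`, `stream_through_engine`, and `placeholder_engine` are hypothetical and do not appear in the claims. The sketch assumes that each engine sees both the current and the one-cycle-delayed value from a first input bank, and the engine arithmetic is a placeholder rather than the FDTD update itself.

```python
class DelayElement:
    """Hypothetical one-cycle delay: returns the value presented on the previous cycle."""

    def __init__(self, initial=0.0):
        self._held = initial

    def step(self, value):
        # Emit the held value and latch the new one for the next cycle.
        previous, self._held = self._held, value
        return previous


def placeholder_engine(current, delayed, other, coeff=0.5):
    # Placeholder arithmetic only (an assumption); not the FDTD update equation.
    return other + coeff * (current - delayed)


def stream_through_engine(first_bank, second_bank, engine):
    """Stream two input banks through one engine, one entry per cycle.

    first_bank  -> delay element -> engine
    second_bank -----------------> engine
    engine      -> output bank
    """
    # Assumption: the delay element is primed with the first entry of the bank.
    delay = DelayElement(initial=first_bank[0] if first_bank else 0.0)
    output_bank = []
    for current, other in zip(first_bank, second_bank):
        delayed = delay.step(current)  # neighboring value from the previous cycle
        output_bank.append(engine(current, delayed, other))
    return output_bank
```

Under these assumptions, calling `stream_through_engine([1.0, 3.0, 6.0], [10.0, 10.0, 10.0], placeholder_engine)` produces one output entry per cycle. Because every connection is fixed when the function is written, no addressing logic is needed between the banks and the engine.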

8. An organization scheme of cache memory for hardware acceleration of the FDTD method in a very high bandwidth, dual-port on-chip memory, comprising:
a first plurality of input memory banks connected to corresponding one-cycle delay elements;
a plurality of computation engines connected to corresponding delay elements;
a second plurality of input memory banks connected to corresponding computation engines; and
a plurality of output memory banks connected to corresponding computation engines.

9. An organization scheme of cache memory as recited in claim 8, wherein the first plurality of input memory banks handles fields having i and j directional dependencies.

10. An organization scheme of cache memory as recited in claim 9, wherein each of the first plurality of input memory banks includes a first channel that handles fields having i directional dependencies, and a second channel that handles fields having j directional dependencies.

11. An organization scheme of cache memory as recited in claim 8, wherein the second plurality of input memory banks handles fields having k directional dependencies.

12. An organization scheme of cache memory as recited in claim 8, wherein each of the plurality of output memory banks buffers updated fields before storing the updated fields to a bulk memory.

13. An organization scheme of cache memory as recited in claim 8, wherein the plurality of output memory banks comprises at least six output memory banks.

14. A method of using an organization scheme of cache memory for hardware acceleration of the FDTD method in a very high bandwidth, dual-port on-chip memory, the organization scheme comprising: a first plurality of input memory banks connected to corresponding one-cycle delay elements, a plurality of computation engines connected to corresponding delay elements, a second plurality of input memory banks connected to corresponding computation engines, and a plurality of output memory banks connected to corresponding computation engines, the method comprising:
loading dual fields of data into the first plurality of input memory banks, and simultaneously moving old values in the first plurality of input memory banks to the second plurality of input memory banks;
loading primary fields of data into the second plurality of input memory banks;
beginning computations and iterating over the primary fields of data with the plurality of computation engines;
storing updated fields in the plurality of output memory banks when updated fields emerge from the plurality of computation engines; and
writing updated fields to bulk storage.

15. A method of using an organization scheme of cache memory as recited in claim 14, wherein the loading of dual fields of data into the first plurality of input memory banks continues until the first plurality of input memory banks are full.

16. A method of using an organization scheme of cache memory as recited in claim 14, further comprising:
determining whether the method is complete, wherein if the method is complete, the method stops, otherwise the method moves to the next data and repeats.
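
For illustration only, the sequence of steps in claims 14 and 16 can be sketched as a software loop. The function `run_schedule`, its parameters, and the representation of banks as Python lists are hypothetical. The sketch assumes that the second banks hold the old first-bank values alongside the primary fields, that old values are zero on the first pass, and that the caller supplies the engine arithmetic.

```python
def run_schedule(chunks, engine):
    """Process (dual_fields, primary_fields) pairs in the order the steps are recited.

    chunks: iterable of (dual_fields, primary_fields) pairs of equal-length lists.
    engine: callable (dual, old, primary) -> updated value; supplied by the caller.
    """
    first_banks = []   # models the first plurality of input memory banks
    bulk_storage = []  # models bulk storage
    for dual_fields, primary_fields in chunks:
        # Load dual fields into the first banks while moving the old values
        # to the second banks (assumption: zeros when there are no old values).
        old_values = first_banks if first_banks else [0.0] * len(dual_fields)
        first_banks = list(dual_fields)
        # Load primary fields into the second banks alongside the old values.
        second_banks = list(zip(old_values, primary_fields))
        # Compute, storing updated fields in the output banks as they emerge.
        output_banks = [
            engine(dual, old, primary)
            for dual, (old, primary) in zip(first_banks, second_banks)
        ]
        # Write updated fields to bulk storage.
        bulk_storage.extend(output_banks)
        # The loop ending models the completion check: stop when no data
        # remains, otherwise move to the next data and repeat.
    return bulk_storage
```

A caller might pass `engine=lambda dual, old, primary: primary + (dual - old)` as a stand-in update. That expression is a placeholder chosen for this sketch, not an equation taken from the claims.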